Query Processing and Optimization

查询处理共分为三个过程:执行 (Parsing & Translation)、优化 (Optimization)、估计 (Evaluation)

Parsing & Translation

Measures of Query Cost

Selection

External Merge Sort

Join

(1) Nested-Loop Join
rθs,伪代码如下:

for each tuple tr in r do begin
	for each tuple ts in s do begin
		test pair (tr, ts) to see if they satisfy the join condition
		if they do, add tr·ts to the result
	end
end

(2) Block Nested-Loop Join

for each block Br in r do begin
	for each block Bs in s do begin
		for each tuple tr in Br do begin
			for each tuple ts in Bs do begin
				test pair (tr, ts) to see if they satisfy the join condition
				if they do, add tr·ts to the result
			end
		end
	end
end

(3) Indexed Nested-Loop Join

(4) Merge-Join

(5) Hash-Join

大致流程如下

将r划分成nh个块
将s划分成nh个块

for 0 to nh-1
	对r对应块中的元素建立索引Ri
	for 遍历s对应块中的元素
		检索索引Ri
		for 所有匹配的tuple in r
			join
		end
	end
end

假设 r,s 分别被划分了 nr,ns 个块,但当 nr,ns>M 超过内存的容纳空间时,则需要进行递归划分。

Evaluation

对一些单操作的复杂度有了解之后,可以使用物化 (Materialization) 和流水 (pipelining) 的方法将他们串起来

Optimization

Equivalence Rule(等价关系表达式)

  1. Conjunctive selection operations can be deconstructed into a sequence of indicidual selections σθ1θ2(E)=σθ1(σθ2(E))
  2. Selection operations are commutative σθ1(σθ2(E))=σθ2(σθ1(E))
  3. Only the last in a sequence of projection operations is needed, the others can be ommitted.ΠL1(ΠL2(...(ΠLn(E))...))=ΠL1(E)
  4. Selections can be combined with Cartesian products and theta joins.
    (1) σθ(E1×E2)=E1θE2
    (2) σθ1(E1θ2E2)=E1θ1θ2E2
  5. Theta-join operations (and natural joins) are commutative.E1θE2=E2θE1
  6. Natural join operatins are associative (E1E2)E3=E1(E2E3), Theta joins are associative in the following manner (E1θ1E2)θ2θ3E3=E1θ1θ3(E2θ2E3), where theta2 involves attributes from only E2 and E3.
  7. The selection operation distributes over the theta join operation under the following two conditions:
    (1) When all the attributes in θ0 involve only the attributes of one of the expressions (E1) being joined σθ0(E1θE2)=(σθ0(E1))θE2
    (2) When θ1 involves only the attributes of E1 and θ2 involves only the attributes of E2. σθ1θ2(E1θE2)=(σθ1(E1))θ(σθ2(E2))
  8. The projetion operation distributes over the theta join operation as follows:
    (1) if θ involves only attributes from L1L2: ΠL1L2(E1θE2)=(ΠL1(E1))θ(ΠL2(E2))
    (2) Consider a join E1θE2.
    (3) Let L1 and L2 be sets of attributes from E1 and E2 respectively
    (4) Let L3 be attributes of E1 that are involved in join condition θ, but are not in L1L2
    (5) Let L4 be attributes of E2 that are involved in join condition θ, but are not in L1L2
    ΠL1L2(E1θE2)=ΠL1L2((ΠL1L3(E1))θ(ΠL2L4(E2)))
  9. The set operations union and intersection are commutative E1E2=E2E1, E1E2=E2E1
  10. Set union and intersection are assocative (E1E2)E3=E1(E2E3), (E1E2)E3=E1(E2E3)
  11. The selection operation distributes over , and σθ(E1E2)=σθ(E1)σθ(E2)
  12. The projection operation distributes over union ΠL(E1E2)=(ΠL(E1))(ΠL(E2))

Cost Estimation(结果集大小估计)